{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
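    "# 3.3 Concise Implementation of Linear Regression\n",
    "\n",
    "As deep learning frameworks have matured, building deep learning applications has become increasingly convenient. In practice, we can usually implement the same model with far less code than in the previous section. In this section, we show how to use the Keras API recommended by TensorFlow 2.0 to train a linear regression model more conveniently."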
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
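    "## 3.3.1 Generating the Dataset\n",
    "\n",
    "We generate the same dataset as in the previous section, where `features` holds the training features and `labels` holds the labels."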
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "\n",
    "num_inputs = 2\n",
    "num_examples = 1000\n",
    "true_w = [2, -3.4]\n",
    "true_b = 4.2\n",
    "features = tf.random.normal(shape=(num_examples, num_inputs), stddev=1)\n",
    "labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b\n",
    "# Add random noise to the labels\n",
    "labels += tf.random.normal(labels.shape, stddev=0.01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
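    "## 3.3.2 Reading the Data\n",
    "\n",
    "Although TensorFlow 2.0 can fit a linear regression model directly, without our splitting the dataset into batches by hand, it is still worth learning how to read data."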
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow import data as tfdata\n",
    "\n",
    "batch_size = 10\n",
    "# Combine the features and labels of the training set\n",
    "dataset = tfdata.Dataset.from_tensor_slices((features, labels))\n",
    "# Read mini-batches in random order\n",
    "dataset = dataset.shuffle(buffer_size=num_examples)\n",
    "dataset = dataset.batch(batch_size)\n",
    "# iter() creates an iterator over the dataset\n",
    "data_iter = iter(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
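    "The `buffer_size` argument of `shuffle` should be greater than or equal to the number of examples so that the whole dataset is shuffled uniformly, and `batch` splits the data into mini-batches of size `batch_size`."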
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "tf.Tensor(\n",
      "[[ 1.5882721  -0.4413251 ]\n",
      " [-0.41925693  0.692152  ]\n",
      " [-2.068257   -1.0048193 ]\n",
      " [ 0.5290812   0.07397223]\n",
      " [ 1.2481225  -0.6947398 ]\n",
      " [ 1.4212433   0.4069843 ]\n",
      " [ 0.30451813 -0.07301793]\n",
      " [-0.09720113  0.6606247 ]\n",
      " [ 0.15103829 -0.04567332]\n",
      " [ 1.4664663  -0.5699477 ]], shape=(10, 2), dtype=float32) tf.Tensor(\n",
      "[8.88295   1.0037544 3.4902809 5.0176535 9.049796  5.6652336 5.0645494\n",
      " 1.7540811 4.6641946 9.056335 ], shape=(10,), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "for (batch, (X, y)) in enumerate(dataset):\n",
    "    print(batch)\n",
    "    print(X, y)\n",
    "    break"
   ]
  },
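  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the roles of `shuffle` and `batch` concrete, the following is a minimal pure-Python sketch of the same idea. It is not how `tf.data` is implemented; the helper name `iterate_minibatches` and the use of the standard `random` module are assumptions made purely for illustration.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "def iterate_minibatches(num_examples, batch_size, seed=0):\n",
    "    # Hypothetical helper: yield lists of example indices, one list per mini-batch.\n",
    "    indices = list(range(num_examples))\n",
    "    # Shuffling all indices at once plays the role of buffer_size >= num_examples.\n",
    "    random.Random(seed).shuffle(indices)\n",
    "    for start in range(0, num_examples, batch_size):\n",
    "        yield indices[start:start + batch_size]\n",
    "\n",
    "batches = list(iterate_minibatches(num_examples=1000, batch_size=10))\n",
    "print(len(batches), len(batches[0]))\n",
    "```\n",
    "\n",
    "With 1000 examples and a batch size of 10, every example appears in exactly one of the 100 mini-batches of each pass over the data."
   ]
  },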
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
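    "## 3.3.3 Defining the Model and Initializing Parameters\n",
    "\n",
    "TensorFlow 2.0 recommends defining networks with Keras, so that is what we use here.\n",
    "We first define a model variable `model`, which is a `Sequential` instance.\n",
    "In Keras, a `Sequential` instance can be viewed as a container that chains layers together.\n",
    "\n",
    "When constructing the model, we add layers to this container in order.\n",
    "When input data is given, each layer in the container in turn infers its input shape from the output of the layer before it.\n",
    "An important point is that in Keras we do not need to specify the input shape of each layer.\n",
    "For linear regression, the input and output layers together are equivalent to a single fully connected layer, `keras.layers.Dense()`.\n",
    "\n",
    "In Keras, the `kernel_initializer` and `bias_initializer` options set how the weights and the bias are initialized, respectively. We import the `initializers` module from `tensorflow`; `RandomNormal(stddev=0.01)` specifies that each weight element is randomly sampled at initialization from a normal distribution with mean 0 and standard deviation 0.01. The bias is initialized to zero by default."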
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow import keras\n",
    "from tensorflow.keras import layers\n",
    "from tensorflow import initializers as init\n",
    "\n",
    "model = keras.Sequential()\n",
    "model.add(layers.Dense(1, kernel_initializer=init.RandomNormal(stddev=0.01)))"
   ]
  },
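  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what a single-output fully connected layer computes, the sketch below evaluates `y = X w + b` in plain Python. The function name `dense_forward` is hypothetical, and the values of `w` and `b` are simply the true parameters from section 3.3.1 rather than anything Keras has learned.\n",
    "\n",
    "```python\n",
    "def dense_forward(X, w, b):\n",
    "    # Hypothetical helper: one output per row of X, the dot product with w plus b.\n",
    "    return [sum(x_j * w_j for x_j, w_j in zip(x, w)) + b for x in X]\n",
    "\n",
    "X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]\n",
    "outputs = dense_forward(X, w=[2.0, -3.4], b=4.2)\n",
    "print(outputs)\n",
    "```\n",
    "\n",
    "In the Keras layer, `w` corresponds to the kernel and `b` to the bias, which is why `Dense(1)` alone is enough to express linear regression."
   ]
  },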
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
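    "## 3.3.4 Defining the Loss Function\n",
    "\n",
    "TensorFlow provides various loss functions, as well as a base class for custom losses, in the `losses` module. We directly use its mean squared error loss as the model's loss function.\n"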
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow import losses\n",
    "loss = losses.MeanSquaredError()"
   ]
  },
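  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The quantity computed by `MeanSquaredError` is the average of the squared differences between labels and predictions. Below is a minimal pure-Python sketch that reuses the example values from the class docstring printed by `help(losses.MeanSquaredError)`. The function `mean_squared_error` here is our own helper operating on plain lists, not the TensorFlow function of the same name.\n",
    "\n",
    "```python\n",
    "def mean_squared_error(y_true, y_pred):\n",
    "    # Average of the squared differences over all elements.\n",
    "    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)\n",
    "\n",
    "mse = mean_squared_error([0., 0., 1., 1.], [1., 1., 1., 0.])\n",
    "print(mse)  # 0.75\n",
    "```\n",
    "\n",
    "Three of the four predictions are off by exactly 1, so the mean of the squared errors is 3 / 4."
   ]
  },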
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
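    "## 3.3.5 Defining the Optimization Algorithm\n",
    "\n",
    "Likewise, we do not need to implement mini-batch stochastic gradient descent ourselves. The `tensorflow.keras.optimizers` module provides many common optimization algorithms, such as SGD, Adam, and RMSProp. Below we create an optimizer instance for updating all parameters of `model`, using mini-batch stochastic gradient descent (SGD) with a learning rate of 0.03."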
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow.keras import optimizers\n",
    "trainer = optimizers.SGD(learning_rate=0.03)"
   ]
  },
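  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For intuition about what the optimizer does with each mini-batch, here is a sketch of plain SGD for linear regression under the mean squared error loss, written without TensorFlow. The function name `sgd_step` and the noise-free toy batch are assumptions for illustration; without momentum, `optimizers.SGD` applies the analogous update `param -= learning_rate * grad` to every trainable variable.\n",
    "\n",
    "```python\n",
    "def sgd_step(w, b, X, y, learning_rate):\n",
    "    # Hypothetical helper: one SGD update for y_hat = X w + b under the MSE loss.\n",
    "    n = len(X)\n",
    "    errors = [sum(wj * xj for wj, xj in zip(w, x)) + b - yi for x, yi in zip(X, y)]\n",
    "    # Gradients of mean((y_hat - y)^2) with respect to w and b.\n",
    "    grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, X)) for j in range(len(w))]\n",
    "    grad_b = 2.0 / n * sum(errors)\n",
    "    new_w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]\n",
    "    new_b = b - learning_rate * grad_b\n",
    "    return new_w, new_b\n",
    "\n",
    "# Toy batch generated from w = [2, -3.4], b = 4.2 without noise.\n",
    "X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]\n",
    "y = [2.0 * x0 - 3.4 * x1 + 4.2 for x0, x1 in X]\n",
    "w, b = [0.0, 0.0], 0.0\n",
    "for _ in range(10000):\n",
    "    w, b = sgd_step(w, b, X, y, learning_rate=0.03)\n",
    "print(w, b)\n",
    "```\n",
    "\n",
    "Because the toy labels contain no noise, repeated updates drive `w` and `b` toward the generating values 2, -3.4 and 4.2."
   ]
  },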
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
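    "## 3.3.6 Training the Model\n",
    "\n",
    "When training a model with TensorFlow, we use `tensorflow.GradientTape` to record the computation in eager mode, and call `tape.gradient` to obtain the gradient of the loss with respect to each variable. We find the variables to update through `model.trainable_variables` and update the weights with `trainer.apply_gradients`, which completes one training step."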
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch 1, loss: 0.000247\n",
      "epoch 2, loss: 0.000103\n",
      "epoch 3, loss: 0.000102\n"
     ]
    }
   ],
   "source": [
    "num_epochs = 3\n",
    "for epoch in range(1, num_epochs + 1):\n",
    "    for (batch, (X, y)) in enumerate(dataset):\n",
    "        with tf.GradientTape() as tape:\n",
    "            # Keras losses take (y_true, y_pred) in that order\n",
    "            l = loss(y, model(X, training=True))\n",
    "        grads = tape.gradient(l, model.trainable_variables)\n",
    "        trainer.apply_gradients(zip(grads, model.trainable_variables))\n",
    "\n",
    "    l = loss(labels, model(features))\n",
    "    print('epoch %d, loss: %f' % (epoch, l))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
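    "Next, we compare the learned model parameters with the true ones. We can obtain the model's weight (`weight`) and bias (`bias`) through its `get_weights()` method. The learned parameters are very close to the true parameters."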
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([2, -3.4], array([[ 1.9994655],\n",
       "        [-3.40126  ]], dtype=float32))"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "true_w, model.get_weights()[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4.2, array([4.199513], dtype=float32))"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "true_b, model.get_weights()[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
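    "## Summary\n",
    "\n",
    "- TensorFlow lets us implement models much more concisely.\n",
    "- The `tensorflow.data` module provides tools for data processing, the `tensorflow.keras.layers` module defines a large number of neural network layers, the `tensorflow.initializers` module defines various initialization methods, and the `tensorflow.optimizers` module provides various optimization algorithms for models."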
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on function mean_squared_error in module tensorflow.python.keras.losses:\n",
      "\n",
      "mean_squared_error(y_true, y_pred)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(losses.mean_squared_error)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Help on class MeanSquaredError in module tensorflow.python.keras.losses:\n",
      "\n",
      "class MeanSquaredError(LossFunctionWrapper)\n",
      " |  MeanSquaredError(reduction='auto', name='mean_squared_error')\n",
      " |  \n",
      " |  Computes the mean of squares of errors between labels and predictions.\n",
      " |  \n",
      " |  `loss = square(y_true - y_pred)`\n",
      " |  \n",
      " |  Usage:\n",
      " |  \n",
      " |  ```python\n",
      " |  mse = tf.keras.losses.MeanSquaredError()\n",
      " |  loss = mse([0., 0., 1., 1.], [1., 1., 1., 0.])\n",
      " |  print('Loss: ', loss.numpy())  # Loss: 0.75\n",
      " |  ```\n",
      " |  \n",
      " |  Usage with the `compile` API:\n",
      " |  \n",
      " |  ```python\n",
      " |  model = tf.keras.Model(inputs, outputs)\n",
      " |  model.compile('sgd', loss=tf.keras.losses.MeanSquaredError())\n",
      " |  ```\n",
      " |  \n",
      " |  Method resolution order:\n",
      " |      MeanSquaredError\n",
      " |      LossFunctionWrapper\n",
      " |      Loss\n",
      " |      builtins.object\n",
      " |  \n",
      " |  Methods defined here:\n",
      " |  \n",
      " |  __init__(self, reduction='auto', name='mean_squared_error')\n",
      " |      Initialize self.  See help(type(self)) for accurate signature.\n",
      " |  \n",
      " |  ----------------------------------------------------------------------\n",
      " |  Methods inherited from LossFunctionWrapper:\n",
      " |  \n",
      " |  call(self, y_true, y_pred)\n",
      " |      Invokes the `LossFunctionWrapper` instance.\n",
      " |      \n",
      " |      Args:\n",
      " |        y_true: Ground truth values.\n",
      " |        y_pred: The predicted values.\n",
      " |      \n",
      " |      Returns:\n",
      " |        Loss values per sample.\n",
      " |  \n",
      " |  get_config(self)\n",
      " |  \n",
      " |  ----------------------------------------------------------------------\n",
      " |  Methods inherited from Loss:\n",
      " |  \n",
      " |  __call__(self, y_true, y_pred, sample_weight=None)\n",
      " |      Invokes the `Loss` instance.\n",
      " |      \n",
      " |      Args:\n",
      " |        y_true: Ground truth values. shape = `[batch_size, d0, .. dN]`\n",
      " |        y_pred: The predicted values. shape = `[batch_size, d0, .. dN]`\n",
      " |        sample_weight: Optional `sample_weight` acts as a\n",
      " |          coefficient for the loss. If a scalar is provided, then the loss is\n",
      " |          simply scaled by the given value. If `sample_weight` is a tensor of size\n",
      " |          `[batch_size]`, then the total loss for each sample of the batch is\n",
      " |          rescaled by the corresponding element in the `sample_weight` vector. If\n",
      " |          the shape of `sample_weight` is `[batch_size, d0, .. dN-1]` (or can be\n",
      " |          broadcasted to this shape), then each loss element of `y_pred` is scaled\n",
      " |          by the corresponding value of `sample_weight`. (Note on`dN-1`: all loss\n",
      " |          functions reduce by 1 dimension, usually axis=-1.)\n",
      " |      \n",
      " |      Returns:\n",
      " |        Weighted loss float `Tensor`. If `reduction` is `NONE`, this has\n",
      " |          shape `[batch_size, d0, .. dN-1]`; otherwise, it is scalar. (Note `dN-1`\n",
      " |          because all loss functions reduce by 1 dimension, usually axis=-1.)\n",
      " |      \n",
      " |      Raises:\n",
      " |        ValueError: If the shape of `sample_weight` is invalid.\n",
      " |  \n",
      " |  ----------------------------------------------------------------------\n",
      " |  Class methods inherited from Loss:\n",
      " |  \n",
      " |  from_config(config) from builtins.type\n",
      " |      Instantiates a `Loss` from its config (output of `get_config()`).\n",
      " |      \n",
      " |      Args:\n",
      " |          config: Output of `get_config()`.\n",
      " |      \n",
      " |      Returns:\n",
      " |          A `Loss` instance.\n",
      " |  \n",
      " |  ----------------------------------------------------------------------\n",
      " |  Data descriptors inherited from Loss:\n",
      " |  \n",
      " |  __dict__\n",
      " |      dictionary for instance variables (if defined)\n",
      " |  \n",
      " |  __weakref__\n",
      " |      list of weak references to the object (if defined)\n",
      "\n"
     ]
    }
   ],
   "source": [
    "help(losses.MeanSquaredError)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
